Markov Decision Problems Where Means Bound Variances

Similar resources

Markov Decision Problems Where Means Bound Variances

We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of the optimal total reward can be bounded by a simple affine function of its expected value. The class is characterized by three natural properties: reward boundedness, existence of a do-nothing action, and optimal action monotonicity. These properties are commonly present and typically easy to ch...

Markov Decision Problems

Markov Decision Problems (MDPs) are the foundation for many problems that are of interest to researchers in Artificial Intelligence and Operations Research. In this paper, we will review what is known about algorithms for solving MDPs as well as the complexity of solving MDPs in general. We will argue that, even though there are theoretically efficient algorithms for solving MDPs, these algorit...

Linearly-solvable Markov decision problems

Advances in Neural Information Processing Systems, 2006. We introduce a class of MDPs which greatly simplify Reinforcement Learning. They have discrete state spaces and continuous control spaces. The controls have the effect of rescaling the transition probabilities of an underlying Markov chain. A control cost penalizing KL divergence between controlled and uncontrolled transition probabilities ...

Denumerable Constrained Markov Decision Problems and Finite Approximations

The purpose of this paper is twofold. First, to establish the theory of discounted constrained Markov decision processes with countable state and action spaces and a general multi-chain structure. Second, to introduce finite approximation methods. We define the occupation measures and obtain properties of the set of all achievable occupation measures under the different admissible policies. We est...

Lower Bound On the Computational Complexity of Discounted Markov Decision Problems

We study the computational complexity of the infinite-horizon discounted-reward Markov Decision Problem (MDP) with a finite state space S and a finite action space A. We show that any randomized algorithm needs a running time at least Ω(|S||A|) to compute an ε-optimal policy with high probability. We consider two variants of the MDP where the input is given in specific data structures, including...

Journal

Journal title: Operations Research

Year: 2014

ISSN: 0030-364X,1526-5463

DOI: 10.1287/opre.2014.1281